Vision based estimation, localization, and mapping for autonomous vehicles
In this dissertation, we focus on developing simultaneous localization and mapping (SLAM) algorithms with a robot-centric estimation framework primarily using monocular vision sensors. A primary contribution of this work is to use a robot-centric mapping framework concurrently with a world-centric localization method. We exploit the differential equation of motion of the normalized pixel coordinates of each point feature in the robot body frame. Another contribution of our work is to exploit a multiple-view geometry formulation with initial and current view projection of point features. We extract the features from objects surrounding the river and their reflections. The correspondences of the features are used along with the attitude and altitude information of the robot. We demonstrate that the observability of the estimation system is improved by applying our robot-centric mapping framework and multiple-view measurements. Using the robot-centric mapping framework and multiple-view measurements including reflection of features, we present a vision based localization and mapping algorithm that we developed for an unmanned aerial vehicle (UAV) flying in a riverine environment. Our algorithm estimates the 3D positions of point features along a river and the pose of the UAV. Our UAV is equipped with a lightweight monocular camera, an inertial measurement unit (IMU), a magnetometer, an altimeter, and an onboard computer. To our knowledge, we report the first result that exploits the reflections of features in a riverine environment for localization and mapping. We also present an omnidirectional vision based localization and mapping system for a lawn mowing robot. Our algorithm can detect whether the robotic mower is contained in a permitted area. Our robotic mower is modified with an omnidirectional camera, an IMU, a magnetometer, and a vehicle speed sensor. Here, we also exploit the robot-centric mapping framework. 
The estimator in our system generates a 3D point based map with landmarks. Concurrently, the estimator defines a boundary of the mowing area by using the estimated trajectory of the mower. The estimated boundary and the landmark map are provided for the estimation of the mowing location and for the containment detection. First, we derive a nonlinear observer with contraction analysis and pseudo-measurements of the depth of each landmark to prevent the map estimator from diverging. Of particular interest for this work is ensuring that the estimator for localization and mapping will not fail due to the nonlinearity of the system model. For batch estimation, we design a hybrid extended Kalman smoother for our localization and robot-centric mapping model. Finally, we present a single camera based SLAM algorithm using a convex optimization based nonlinear estimator. We validate the effectiveness of our algorithms through numerical simulations and outdoor experiments.
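The abstract does not say how containment is tested against the estimated boundary. As a minimal sketch, assuming the boundary is available as a closed 2D polygon of ground-plane points (an assumption, as are the names `is_inside` and `boundary`), a standard ray-casting test could look like this:

```python
def is_inside(point, boundary):
    """Ray-casting point-in-polygon test (illustrative, not the authors' method).

    point    -- (x, y) position estimate of the mower (hypothetical input)
    boundary -- list of (x, y) vertices of the permitted area, in order
    """
    x, y = point
    inside = False
    n = len(boundary)
    for i in range(n):
        x1, y1 = boundary[i]
        x2, y2 = boundary[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips parity
    return inside


square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(is_inside((2.0, 2.0), square))
print(is_inside((5.0, 2.0), square))
```

A real system would also have to account for uncertainty in both the pose estimate and the boundary, which this sketch ignores.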
A New Spectral Method in Time Series Analysis
In this dissertation, we propose a new spectral method that could be used to overcome two issues in time series analysis.
The first issue is the small sample problem. The periodogram is widely used to analyze second order stationary time series, since the expectation of the periodogram is approximately equal to the underlying spectral density of the time series. However, it is well known that the periodogram suffers from a finite sample bias. We show that the bias arises because of the finite boundary of observation in the discrete Fourier transform (DFT), which is used in the construction of the periodogram. Moreover, we show that by using the best linear predictors of the time series outside the observed domain, we can obtain the “complete periodogram” that is an unbiased estimator of the spectral density. We propose a method for estimating the best linear predictors and show, both theoretically and empirically, that the resulting estimated complete periodogram has a smaller bias than the regular periodogram. The estimated complete periodogram can be used to estimate parameters that are expressed as a weighted sum of the spectral density.
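For orientation, the regular periodogram mentioned above can be sketched in a few lines. This is only the classical estimator, not the complete periodogram proposed in the dissertation. The normalization by the square root of n, the mean-centering, and the function name `periodogram` are assumptions chosen for illustration.

```python
import cmath
import math


def periodogram(x):
    """Regular periodogram at the Fourier frequencies w_k = 2*pi*k/n.

    Assumed convention: J(w) = n**-0.5 * sum_t x_t * exp(-i*t*w), I(w) = |J(w)|**2.
    The series is mean-centered first, so I(0) is zero.
    """
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    freqs, values = [], []
    for k in range(n):
        w = 2.0 * math.pi * k / n
        dft = sum(c * cmath.exp(-1j * w * t) for t, c in enumerate(centered))
        freqs.append(w)
        values.append(abs(dft) ** 2 / n)
    return freqs, values


freqs, values = periodogram([1.0, 2.0, 3.0, 4.0])
# By Parseval's identity, the ordinates sum to the centered sum of squares.
print(sum(values))
```

The direct O(n^2) sum is used for readability; an FFT would be the usual choice for long series.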
The second issue is the discrepancy between time and frequency domain methods in parameter estimation. In time series analysis, there is a clear distinction between the methods of the two domains. We draw connections between them by deriving an exact and interpretable bound between the Gaussian and Whittle likelihoods of a second order stationary time series. The derivation is based on obtaining the transformation that is biorthogonal to the DFT of the time series. Such a transformation yields a new decomposition for the inverse of a Toeplitz matrix and enables the representation of the Gaussian likelihood within the frequency domain. Based on this result, we obtain an approximation for the difference between the Gaussian and Whittle likelihoods and define two new frequency domain quasi-likelihood criteria. We show that these new criteria are computationally fast and yield a better approximation of the spectral divergence criterion, as compared to both the Gaussian and Whittle likelihoods.
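To make the frequency domain side concrete, here is a minimal sketch of the standard Whittle criterion for an AR(1) model. It does not implement the new quasi-likelihood criteria from the dissertation. The AR(1) choice, the omission of the zero frequency, and the names `simulate_ar1`, `ar1_spectral_density`, and `whittle_criterion` are all assumptions made for illustration.

```python
import cmath
import math
import random


def simulate_ar1(n, phi, seed=0):
    """Simulate x_t = phi * x_{t-1} + e_t with standard normal e_t."""
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        x.append(prev)
    return x


def ar1_spectral_density(w, phi, sigma2=1.0):
    """f(w) = sigma2 / |1 - phi * exp(-i*w)|**2 (no 1/(2*pi) factor)."""
    return sigma2 / (1.0 - 2.0 * phi * math.cos(w) + phi * phi)


def whittle_criterion(x, phi, sigma2=1.0):
    """Sum over nonzero Fourier frequencies of I(w)/f(w) + log f(w).

    Smaller values indicate a better fit of the candidate spectral density.
    """
    n = len(x)
    total = 0.0
    for k in range(1, n):  # skip k = 0, which only carries the sample mean
        w = 2.0 * math.pi * k / n
        dft = sum(v * cmath.exp(-1j * w * t) for t, v in enumerate(x))
        i_w = abs(dft) ** 2 / n
        f_w = ar1_spectral_density(w, phi, sigma2)
        total += i_w / f_w + math.log(f_w)
    return total


x = simulate_ar1(200, 0.6)
grid = [i / 10.0 for i in range(-9, 10)]
print(min(grid, key=lambda p: whittle_criterion(x, p)))
```

Minimizing over a coarse grid is only for demonstration; in practice a numerical optimizer would be used, and the innovation variance would be estimated as well.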
Unified Contrastive Fusion Transformer for Multimodal Human Action Recognition
Various types of sensors have been considered to develop human action
recognition (HAR) models. Robust HAR performance can be achieved by fusing
multimodal data acquired by different sensors. In this paper, we introduce a
new multimodal fusion architecture, referred to as the Unified Contrastive Fusion
Transformer (UCFFormer), designed to integrate data with diverse distributions
to enhance HAR performance. Based on the embedding features extracted from each
modality, UCFFormer employs the Unified Transformer to capture the
inter-dependency among embeddings in both time and modality domains. We present
the Factorized Time-Modality Attention to perform self-attention efficiently
for the Unified Transformer. UCFFormer also incorporates contrastive learning
to reduce the discrepancy in feature distributions across various modalities,
thus generating semantically aligned features for information fusion.
Performance evaluation conducted on two popular datasets, UTD-MHAD and NTU
RGB+D, demonstrates that UCFFormer achieves state-of-the-art performance,
outperforming competing methods by considerable margins.
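The abstract does not specify which contrastive objective UCFFormer uses. As a rough sketch of how a contrastive loss can pull paired embeddings from two modalities together, the following implements a generic InfoNCE-style loss. The loss form, the temperature value, and the name `info_nce` are assumptions, not the paper's formulation.

```python
import math


def _normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(a * a for a in v)) or 1.0
    return [a / norm for a in v]


def info_nce(emb_a, emb_b, temperature=0.1):
    """Generic InfoNCE loss between two modalities (illustrative only).

    emb_a[i] and emb_b[i] are embeddings of the same sample from two
    different sensors; every other pairing is treated as a negative.
    """
    a = [_normalize(v) for v in emb_a]
    b = [_normalize(v) for v in emb_b]
    loss = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of sample i in modality A to every sample in B.
        logits = [sum(p * q for p, q in zip(ai, bj)) / temperature for bj in b]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denominator - logits[i]
    return loss / len(a)


aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
print(info_nce(aligned, aligned) < info_nce(aligned, swapped))
```

The loss is low when matching samples have similar embeddings across modalities and high when they are mismatched, which is the alignment effect the abstract describes.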
Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding
Inspired by the impressive performance of recent face image editing methods,
several studies have naturally sought to extend these methods to the
face video editing task. One of the main challenges here is temporal
consistency among edited frames, which is still unresolved. To this end, we
propose a novel face video editing framework based on diffusion autoencoders
that can successfully extract the decomposed features - for the first time as a
face video editing model - of identity and motion from a given video. This
modeling allows us to edit the video consistently by simply manipulating the
temporally invariant feature in the desired direction. Another unique
strength of our model is that, since our model is based on diffusion models, it
can satisfy both reconstruction and edit capabilities at the same time, and is
robust to corner cases in wild face videos (e.g. occluded faces) unlike the
existing GAN-based methods.
Comment: CVPR 2023. Our project page: https://diff-video-ae.github.i
Pixel data real time processing as a next step for HL-LHC upgrades and beyond
The experiments at the LHC are implementing novel and challenging detector
upgrades for the High Luminosity LHC, including the tracking systems. This
paper reports on performance studies, illustrated by an electron trigger, using
a simplified pixel tracker. To achieve a real-time trigger (e.g. processing
HL-LHC collision events at 40 MHz), simple algorithms are developed for
reconstructing pixel-based tracks and track isolation, utilizing look-up tables
based on pixel detector information. Significant gains in electron trigger
performance are seen when pixel detector information is included. In
particular, a rate reduction up to a factor of 20 is obtained with a signal
selection efficiency of more than 95% over the whole coverage of this
detector. Furthermore, it reconstructs p-p collision points in the beam axis
(z) direction, with a high precision of 20 μm resolution in the very
central region, and up to 380 μm in the forward region
(2.7 < |η| < 3.0). This study as well as the results can easily be adapted
to the muon case and to the different tracking systems at LHC and other
machines beyond the HL-LHC. The feasibility of such a real-time processing of
the pixel information is mainly constrained by the Level-1 trigger latency of
the experiment. How this might be overcome by the Front-End ASIC design, new
processors and embedded Artificial Intelligence algorithms is briefly tackled
as well.
Comment: To be submitted to JHE
Survey of Public Attitudes toward the Secondary Use of Public Healthcare Data in Korea
Objectives: Public healthcare data have become crucial to the advancement of medicine, and recent changes in legal structure on privacy protection have expanded access to these data with pseudonymization. Recent debates on public healthcare data use by private insurance companies have shown large discrepancies in perceptions among the general public, healthcare professionals, private companies, and lawmakers. This study examined public attitudes toward the secondary use of public data, focusing on differences between public and private entities.
Methods: An online survey was conducted from January 11 to 24, 2022, involving a random sample of adults between 19 and 65 years of age in 17 provinces, guided by the August 2021 census.
Results: The final survey analysis included 1,370 participants. Most participants were aware of health data collection (72.5%) and recent changes in legal structures (61.4%) but were reluctant to share their pseudonymized raw data (51.8%). Overall, they were favorable toward data use by public agencies but disfavored use by private entities, notably marketing and private insurance companies. Concerns were frequently noted regarding commercial use of data and data breaches. Among the respondents, 50.9% were negative about the use of public healthcare data by private insurance companies, 22.9% favored this use, and 1.9% were “very positive.”
Conclusions: This survey revealed a low understanding among key stakeholders regarding digital health data use, which is hindering the realization of the full potential of public healthcare data. This survey provides a basis for future policy developments and advocacy for the secondary use of health data.